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Abstract 

Computing numerical solutions to fractional differential equations can be computationally 
intensive due to the effect of non-local derivatives in which all previous time points contribute 
to the current iteration. In general, numerical approaches that depend on truncating part of 
the system history while efficient, can suffer from high degrees of error and inaccuracy. Here we 
present an adaptive time step memory method for smooth functions applied to the Griinwald- 
Letnikov fractional diffusion derivative. This method is computationally efficient and results 
in smaller errors during numerical simulations. Sampled points along the system’s history 
at progressively longer intervals are assumed to reflect the values of neighboring time points. 
By including progressively fewer points backward in time, a temporally ‘weighted’ history is 
computed that includes contributions from the entire past of the system, maintaining accuracy, 
but with fewer points actually calculated, greatly improving computational efficiency. 
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1 Introduction 


Fractional differential equations are an extension of integrals and derivatives of integer order. Over 
the last few decades fractional differential equations have played an increasing role in physics [4j[6, 
25 35 , chemistry [8 33 , and engineering l||5 22 . In particular, fractional calculus methods offer 


unique approaches for modeling complex and dynamic processes at molecular and cellular scales in 
biological and physiological systems, and there has been a recent interest in the use of fractional 
calculus methods in bioengineering and biophysical modeling 18 19 . These methods have been 

bio-electrode behavior 
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applied to modeling ultrasonic wave propagation in cancellous bone 
at the cardiac tissue interface 20 , and describing spiny dendrite neurons using a fractional cable 
equation |91. Other work has shown that endogenous lipid granules within yeast exhibit anomalous 
subdiffusion during mitotic cell division 34 


Here we investigate the numerical implementation and computational performance of a fractional 
reaction-diffusion equation. The continuous diffusion equation has been the focal point of transport 
modeling in physical and biological systems for over a hundred years, since first proposed by Adolf 
Fick in 1855. The practical motivation for the present work arose from considering cell signaling 
data between Muller neural glial cells in the neural sensory retina. The neural retina is a direct 
extension and part of the brain itself. Muller cells in the retina can communicate in a paracrine 
fashion by secreting adenosine triphosphate (ATP) that diffuses into the extracellular space that 
then triggers intracellular calcium waves. In some cases these signaling events can produce long 
range regenerative intercellular signaling in networks of cells; see 24 27 37 . In particular, from 
experimental data published in 27 , we qualitatively observed a peaked diffusion profile for ATP 


where the central cusp-like shape persisted over a longer period of time than would be expected 
for Gaussian diffusion; this is a typical characteristic of anomalous subdiffusion 26 . Therefore 


we suspected that the diffusion of ATP in this physiological system was subdiffusive in nature. 
Neurobiologically, this could ultimately have an important effect on underlying physiology and 
cellular signaling. Motivated by these observations, we decided to explore ways of simulating and 
exploring such subdiffusive dynamics using numerical methods amenable to experimental methods 
and data. Modeling subdiffusive neurophysiological transport processes necessitates the use of 
fractional differential equations or other related objects. However, the numerical implementation 
of such equations involves non-local fractional derivative operators that require taking into account 
the entire past history of the system in order to compute the state of the system at the current 
time step. This produces a significant computational burden that limits the practicality of the 
models. In the present paper we introduce an ‘adaptive time step memory’ approach for computing 
the contribution of the memory effect associated with the history of a system in a way that makes 
it both numerically and resource (i.e. computer memory) efficient and robust while maintaining 
accuracy. While our algorithms can be applied to any diffusion application modeled as a fractional 
process, they offer particular value for modeling complex processes with long histories, such as 
lengthy molecular or cellular simulations. 

Approaches for numerically approximating solutions to fractional diffusion equations have been 
extensively studied [2p0|l2pTp3p9||38l , but in general there is always a trade off between computa¬ 
tional efficiency, complexity, and the accuracy of the resultant approximations. There has also been 
a considerable amount of work done on developing fast convolution quadrature algorithms, which is 
relevant to fractional differential equations because the non-local nature of fractional calculus oper¬ 
ators results in either a continuous or discrete convolution of some form. In particular, Lubich and 
others 13 15,17,l3l] have built a generalized and broadly applicable convolution quadrature frame- 
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work to numerically approximate a continuous convolution integral with a wide array of possible 
convolution kernel functions (including oscillatory kernels, singular kernels, kernels with multiple 
time scales, and unknown kernels with known Laplace transforms), while achieving good algorith¬ 
mic performance with respect to complexity and memory requirements. However, their algorithms 
are necessarily very complicated in order to handle a wide range of functions and expressions in the 
convolution while limiting storage requirements and scaling of arithmetic operations. They involve 
approximating a continuous convolution based on numerical inversions of the Laplace transform of 
the convolution kernel function using contour integrals discretized along appropriately chosen Tal¬ 
bot contours or hyperbolas. The scaling of the number of required arithmetic operations involved in 
the convolution, and overall memory requirements, are both reduced by splitting up the continuous 
convolution integral or discrete convolution quadrature summation into a carefully chosen series of 
smaller integrals or summations, and solving a set of ordinary differential equations one time step 
at a time without storing the entire history of the solutions. For methods explicitly involving a 
discrete convolution summation, the quadrature weights are calculated using the Laplace transform 
of the original kernel and linear multistep methods for solving differential equations. FFT tech¬ 
niques can be applied to calculate these weights simultaneously and further increase the efficiency 
of the algorithm 16 . These authors have demonstrated the success of their framework by simu¬ 


lating various differential equations involving convolutions, including a one-dimensional fractional 
diffusion equation 31 .In 11 , Li takes an alternative approach from this framework and focuses 


on a fast time-stepping algorithm specifically for fractional integrals by constructing an efficient Q 
point quadrature, but does not focus on how this fits into numerical algorithms representing larger 
mathematical models. 

The methods we introduce here are also efficient, significantly reducing simulation times, com¬ 
putational overhead, and required memory resources, without significantly affecting the computed 
accuracy of the final solution. However, in contrast to broad generalized frameworks, our algorithms 
are focused on solving fractional differential equations involving nonlocal fractional derivative oper¬ 
ators. We approximate the fractional derivative based on the Griinwald-Letnikov definition instead 
of pursuing a general quadrature formula approximation to the Riemann-Liouville integral defi¬ 
nition of the fractional derivative. The Griinwald-Letnikov derivative is used in many numerical 
schemes for discretizing fractional diffusion equations from a continuous Riemann-Liouville ap¬ 
proach [31,28 38 and involves a discrete convolution between a ‘weight’ or coefficient function and 
the function for which we are interested in taking the derivative. The mathematics and theory of 
this form of a weighting function are well established in the literature 15 28 . Building on this 


foundation avoids the need for domain transformations, contour integration or involved theory. 
Our algorithms were specifically developed for real world applied diffusion and transport physics 
and engineering problems, often where there are practical measurement limitations to the types 
and quality (e.g. granularity) of the data that can be collected or observed. For situations such 
as these, our methods for handling discrete convolutions associated with fractional derivatives are 
more intuitive and accessible than other generalized mathematical methods, where it is often not 
obvious or clear how to implement and apply a generalized approach to a specific physical problem 
under a defined set of constraints. The approaches we introduce here can be incorporated with var¬ 
ious combinations of finite difference spatial discretizations and time marching schemes of a larger 
mathematical model in a straightforward way. 

The increased efficiency and memory savings in the approaches we describe here lie in the way 
the discrete summation is calculated. The conceptual basis that results in a saving of computational 
time without a huge tradeoff in accuracy is in the interpretation of the weight function as a measure 
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of the importance of the history of a system. The more recent history of the system is more 
important in determining the future state of the system, and therefore we make use of an ‘adaptive 
time step memory’ method by changing the interval of the backwards time summation in the 
Griinwald-Letnikov fractional derivative. Instead of incorporating every previous time point into 
the most current calculation, only certain points are recalled based upon their proximity to the 
current point in time, weighted according to the sparsity of the time points. This substantially 
reduces the computational overhead needed to recalculate the prior history at every time step, 
yet maintains a high level of accuracy compared to other approximation methods that make use 
of the Griinwald-Letnikov definition. We consider two adaptive step approaches - one based on 
an arithmetic sequence and another based on a power law, that when combined with a linked-list 
computational data structure approach yield 0(log2{N )) active memory requirements. This is a 
significant improvement over keeping all N steps of the system’s history. We compare our adaptive 
time step approach with the ‘short memory’ method described by Volterra 36 and Pocllubny et 
al 28 , and examine differences in simulation times and errors under similar conditions for both. In 


the last section we sketch out a ‘smart’ adaptive memory method based on a continuous form of the 
Griinwald-Letnikov that will be able to accommodate more dynamic past histories, i.e., histories 
with sharp and abrupt changes relative to the time step being considered, that produce larger error 
in the discrete methods we discuss. Finally, we note that the scope of the present work focuses on 
a detailed development of the methods and algorithms. A full theoretical analysis of stability and 
algorithmic complexity is the topic of a companion paper that will follow this one. 


2 Fractional diffusion and the Griinwald-Letnikov derivative 


2.1 Derivation of the fractional diffusion equation 

We begin by considering the standard diffusion equation with no reaction term 


du 

dt 


a s X7 2 u, 


( 2 . 1 ) 


where a s is the standard diffusion coefficient given in units distance 2 /time, 
version of equation 2.1 is given by 


The time-fractional 


<T ! u 

~din 


= ccV u. 


( 2 . 2 ) 


where here the diffusion coefficient a has fractional order units of distance 2 /time 7 . This is the 
simplest form of the fractional diffusion equation. The real values taken by 7 in equation ?? 
dictate the diffusion regimes obtained by the equation. Values of 0 < 7 < 1 result in subdiffusion, 
characterized by a slow diffusion profile relative to regular diffusion. When 7 > 1, an oscillatory 
component emerges in the diffusion profile resulting in superdiffusion. As required, when 7 = 1 
the fractional diffusion equation reduces to standard diffusion that results in a Gaussian profile. 
When 7 = 2 it reduces to the standard wave equation. As a point of discussion, we note that there 
no obvious theoretical (mathematical) limits to the value 7 can take. However, there necessarily 
exist practical limits on 7 when considering stability and maintaining order of accuracy for the 
numerical implementation and algorithms of the fractional differential equation. Here we only 
concern ourselves with a physically plausible interpretation behind the equation. We know what 
subdiffusion and superdiffusion correspond to physically, but, so far as we are aware, there is 
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no straight forward meaning of equation |2.2| when 7 > 2 that we are aware of. This may be a 
shortcoming of our current physical understanding in the sense that the physics underlying such 
cases are not yet well understood or have not yet been discovered to be connected to particular real 
world dynamical processes. So although theoretically there is no apparent limit, there is a practical 
numerical implementation limit that will depend on a desired order of accuracy and computational 
efficiency. As we eluded to in the Introduction, we leave a full theoretical analysis of the stability 
and complexity of our methods to a follow up companion paper. 

Using the relation 


and substituting into equation ?? yields 


du d 1 7 d 7 u 

(2.3) 

dt ~ at 1 " 7 dt'r 

du <9 1-7 n 

(2.4a) 

m - " 

^ = aD l ~ 1 'S/ 2 u 
at 

(2.4b) 


where Z ? 1-7 denotes the fractional derivative operator which we define and discuss at length in the 
next section. 

If we take into account consumption or generation processes, then we can rewrite the diffusion 
equation as 


— = uD 1 7 V 2 u(x,t) + f(u), x = {x!,x 2 —x N } , u (x, 0) = u 0 (x) (2.5) 

where x is an N dimensional vector and f(u) represents a consumption or generation term. For 
the specific examples we describe in this paper we implement the simplest Diric.hlet boundary 
conditions where all boundaries are set to zero. However, our method can handle various other 
boundary conditions, including Dirichlet conditions set to any values, Neumann conditions, and 
others, in a straight forward manner. In the following section we consider the fractional derivative 
operator D 1-7 . 


2.2 Definition of the Griinwald-Letnikov derivative 

The a th order fractional derivative of a function, denoted by D a f, extends the order of the differ¬ 
ential operator from the set of integers to the set of real numbers. The operator can be mathemati¬ 
cally defined in several ways, including standard descriptions like the Riemann-Liouville definition, 
the Griinwald-Letnikov definition, and others. For numerical simulations, we find the Griinwald- 
Letnikov derivative to be convenient, since it is based on the standard differential operator but 
made applicable to arbitrary order a with a discrete summation and binomial coefficient term: 


D a f(x) = lim h 


x/h 

'“EH)' 

m —0 


f(x — mh). 


Expanding the binomial coefficient yields 



a! 

m\{n — m)\ 


( 2 . 6 ) 


(2.7) 
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and expressing the factorial components of the binomial coefficient by the gamma function gives 


T(a + 1) 


m!r(a + 1 — to )' 

Combining equations |2.6| and |2.8| yields the Griinwald-Letnikov derivative for real numbers: 


v/h 


D a f(x) = lim h~ a V 
J v ’ h ^o ^ 


m =0 


(-l)"T(q+l) 

m!r(a + 1 — to) 


f(x — mh) a € R, a ^ — Ni. 


( 2 . 8 ) 


(2.9) 


It should be noted that computation of the Griinwald-Letnikov derivative requires knowledge of 
the entire past history of the system due to the summation operator. This means that this fractional 
derivative, unlike standard derivatives, is no longer local but global. Computing the value of a real 
derivative requires knowledge of the system’s past history. The derivative must take into account 
all past points from to = 0 , the current point, all the way back to the beginning point to = x/h. 
This requirement places significant computational demands on the numerical implementation of the 
Griinwald-Letnikov derivative, and is the principle technical motivation for the results presented 
here. 

In the rest of the paper we explore the implementation of equation |2.9| to a diffusion regime 
constrained to 0<7 = a <2 (as explained in section 2 . 1 ): 


x/h 


D 1 f(x) = lim h 7 ^ |r J — | ^ ^ f(x — mh) for 0 < 7 < 2. 


m =0 


?7i!T(7 + 1 — to) ‘ 


Applying equation 2.10 to the fractional diffusion equation (eq. 2.4b I yields: 

t/i 


du 

—— = hm 




dt t ->0 ' m\T(2 — 7 — m) 

m—0 v 1 ’ 


( 2 . 10 ) 


( 2 . 11 ) 


where t represents instantaneous time, and r is the time step. 
Next, define a function such that 

( i) m r (2 7 ) 


V ’(7 ,m) = 


to!T( 2 — 7 — to) 


( 2 . 12 ) 


2.12 


Given that the standard definition of the gamma function implies T(a) = ^T(a + 1), equation 
can be reduced to a recursive relationship. Consider the first four terms in the chain from m = 0 
to to = 3: 
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V>( 7 , 0 ) 


V>( 7 , 1 ) 

V>(7,2) 

^(7,3) 


(-i)°-r(2-7) _ 

(0!) • r(2 - 7 - 0) 

(-i) 1 -r(2- 7 ) _ (-i).(-i)°.r(2-7) 

(1!) • r(2 - 7 -1) (1) • (0!) • (53^1) • r(2 - 7 - 0) 

(-i) 2 -r( 2 - 7 ) _ ( -i).(-i)i.r( 2 - 7 ) 

(2!) ■ r(2 - 7 - 2) (2) . (1!) • ^2) ■ T(2 - 7 - 1) 

(-i) 3 -r(2- 7 ) _ (-1) ■ (-1) 2 -r(2- 7 ) 

(3!) ■ r(2 - 7 - 3 ) (3) • (2!) • ^ 3 ) ■ T(2 - 7 - 2) 


Examining this system of equations suggests that for any ^ 1 ( 7 , to): 




(-1) ■ (-!)(—D - T(2 - 7) 


(~ir-r(2-7) = _ 

(m!) • E (2 - 7 - to) (to) • (to - 1 )! • (2 _ 7 _ m) ■ I \2 - 7 - (m - 1 ))' 


(2.13) 


Because the function ^( 7 , to) is dependent on 7 , to— 1), an iterative relationship forms that scales 
by 


— (2—7— m ) m 


lj}( 7 , to) = — 1p( 7 , TO — 1 ) 


2 — 7 — to 


(2.14) 


This recursive function is valid for all 7 including subdiffusion, standard diffusion and superdif¬ 
fusion, so this equation is general over all regimes. Because the entire history of the system must 
be taken into account when computing the current time step, ip('y , to) is used for many m values, 
many times over the course of the simulation, and can be precomputed for values of to = 0 to 
m = N, where N is the total number of time points, resulting in a significant savings in computa¬ 
tional performance. A similar simplification has been previously discussed and used in 28 . Taking 
^( 7 , to) into account yields the final form of the fractional diffusion equation, given by 


du 


t/T 


— = lim r 7 ip(”/,m)\7 2 u(t — tot). 


(2.15) 


m =0 


We can also arrive at equation |2.15| by an alternative approach. Begin by considering again the 
Griinwald-Letnikov derivative in terms of binomial notation: 


= lim r 7 " 1 £ (- 1 )™ f 1 ?7; 7 ) f(t - mr). 

m —0 7 ' 


Applying eq. 2.16 to the fractional diffusion equation (eq. 2.4b) yields 


du 

dt 


t/r 


= lim r 7 1 a (— l) r 

T ->0 ' 


m —0 


1-7' 


V 2 u(x , t — mr) 


(2.16) 


(2.17) 


Knowing the gamma function definition of the factorial operator, a\ = T(a + 1), the binomial 
coefficient can be expanded to accommodate real numbers (also c.f. equation 2 . 8 ): 
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\ m / — 7 — m)\ miL [z — 7 — m) 

Note how combining equations |2.17| and |2.18| yields equation |2.11| as required. 

Next, defining the function 0(y, to) (equation [2d2 1 using binomial notation gives 


V>(7,w) = (-ir 


1-7 


and substituting the relation T(a) = (a — l)r(a — 1 ) into eq. 2.18 results in 

(l)m- 1 r(2 7 ) l 


0 ( 7 , w) = 


(m — l)!r(2 — 7 — (m — 1)) 


z—'y—m 


yielding the iterative relationship in equation |2. 14| above: 

2 — 7 — TO 


■ 0 ( 7 , to) = —0(7, to — 1 )- 


TO 


(2.19) 


( 2 . 20 ) 


With the substitution of 0(7, to) into eq. 2.17 we once again recover the time-fractional diffusion 
equation in the form 


du 


i/r 


— = liin t 7 1 a 0(7, to)V 2 u(s, t — tot). 


r —>-0 


( 2 . 21 ) 


m—0 


2.3 Discretization of the fractional diffusion equation. 

Using 


V 2 u( x, t — mr) = 


d 2 u{x, t — tot) d 2 u(x , t — tot) 






( 2 . 22 ) 


as the formulation in two spatial dimensions, one can discretize the function into a finite difference 
based FTCS scheme (forward time centered space) on a grid (where k = t/A t ,j = aq/A x ,l = 
X 2 /A x , A x is the grid spacing in both directions assuming an equally spaced grid, and A t is the 
time step), using the relations 130 


d 2 u(x, t — mr) 

d 2 u{x, t — mr) 
dx 2 2 

du(x, t) 
dt 


k-m _ k—m , k-m 


A 2 


u j,i +’i “"./•/ 


A 2 , 


7 ,*+ 1 _ yk 

U j,l U j,l 


X j,l~ 1 


(2.23) 


In the discrete limit where r —> A t , we approximate the fractional diffusion equation (equations 
2.15| and 2.21) with the following finite difference expression, which has a first-order approximation 
0(t) (Chapter 7 in 28 , 15 ): 



































Figure 1: Simulation results for 7 = 0.5, 0.75, 0.9, 1.0 (for traces from top to bottom) in one 
dimensional space (panel A) and time (panel B). As the subdiffusion regime is approached the 
profile becomes more and more hypergaussian. 


u j,i ~ u :,i 

A* 


A 


= *-kr E *(r<n)§ 


—m 

l 


m =0 


where 5 k L m is the finite difference kernel given by 


srk—m 

5 i,i 


(v k ~ m 


+ u k ZZ - 4^7 m + u k 7™ + u k ~ m 


j,*+i ^ 


(2.24) 


(2.25) 


Adding a consumption/generation term is straightforward in this implementation. For example, 
take an exponential decay term given by 


du R 

a'-' 3 ” 


with the complementary finite difference relation 




Incorporating eq. 2.26 into the form of eq. 2.5 results in 

^ = olD\~ 1 V 2 u - P u, 

which gives the full finite difference implementation in two dimensions 

v k+ 1 -v k a ' 1 ' -1 k 

3,1 ~ = a -hr E ^(7, m)8 k - m - pu k , 


(2.26) 


(2.27) 


(2.28) 


(2.29) 


m =0 


Figure [l] shows the results of four simulations for different values of 7 . Simulations were run 
on a 100 x 100 grid with initial conditions 1 / 50,50 = 0-1, ^ 51,50 = ^ 50,51 = ^ 49,50 = ^ 50,49 = 0.05 
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and zero elsewhere. Dirichlet boundary conditions were implemented and the simulation edge set 
to zero. Simulations were run for t = 100 seconds with a = 1, /3 = 0, and = 5. 


3 Adaptive time step memory as an arithmetic sequence 


3.1 Derivation 


In eq. 2.29 each iteration requires the re-calculation and summation of every previous time point 
convolved with ^( 7 , m). This becomes increasingly cumbersome for large times, which require 
significant numbers of computations and memory storage requirements. To address this, Podlubny 
et. al [28 introduced the ‘short memory’ principle which assumes that for a large t the role of 
the previous steps or ‘history’ of the equation become less and less important as the convolved 
^( 7 , m) shrinks towards zero. This would then result in approximating eq. 2.29 by truncating 


the backwards summation, only taking into account times on the interval [t — L,t\ instead of [0,t], 
where L is defined as the ‘memory length’ (eq. 7.4 in 28 ; Fig. [ 2 ]). While computationally efficient, 
this approach leads to errors in the final solution since not all points are counted in the summation. 
Despite the resultant errors, this numerical method represents a powerful and simple approach for 
providing a reasonable trade off between computational overhead and numerical accuracy. In the 
context of the implementation derived here, it would result in a discretization scheme given by 


nI k + 1 _ n,k 

u j,l u j,l 


. ry—l min(L/A t ,k) 

^(7, 


A 2 


)$jj m - P* 


k 

' 3,1 


m—0 


(3.1) 


As an alternative to the method of Podlubny, Ford and Simpson proposed an alternative ‘nested 
mesh’ variant that gives a good approximation to the true solution at a reasonable computational 
cost 17 . However, this method is exclusive to the Caputo fractional derivative. Here we introduce 
an approach that is applicable to the more general Griinwald-Letnikov definition. Like these other 
methods, it also shortens computational times but at the same time results in much greater accuracy 
than the use of ‘short memory.’ We achieve this by introducing the concept of an ‘adaptive memory’ 
into the Griinwald-Letnikov discretization. 

The underlying principle of the adaptive time approach is that relative to the current time 
point previous time points contribute different amounts to the summation. Values relatively closer 
to the current time point will have a greater contribution to the current numerical calculation 
than values many time points back due to the multiplier ^>(7,771). For smooth functions, as m 
increases and \ip('y,m)\ decreases, neighboring points in the summation exhibit only small differ¬ 
ences. Consequently, one can take advantage of this and utilize an ‘adaptive memory’ approach in 
which neighboring values at prior time points are grouped together in defined increments and the 
median taken as a representative contribution for the increment weighted according to the length 
of the increment to account for the skipped points. This results in fewer time points that need to 
be computed in a summation. Algorithmically, for an arbitrary number (define this as parameter 
a) of time steps back from the current time point k for which the history of the system is being 
computed, consider an initial interval [ 0 , o] for which all time points within this interval are used 
in the summation and therefore contribute to the the Griinwald-Letnikov discretization. Let sub¬ 
sequent time intervals become longer the further away from k they are and within them only every 
other d time points are used in the summation term, i.e., only the median values of a temporally 
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shifting calculation increment of length d along the current interval are considered. As subsequent 
intervals become longer, so does d, thereby requiring less points to compute (Fig. [2]). 

Definition 3.1. Let k be the current iterative time step in a reaction diffusion process for which 
a Griinwald-Letnikov discretization is being computed. Consider an arbitrary time point a in the 
history of the system backwards from k. For i £ Ni, i ^ 1, define an interval of this past history by 

/ = [a *" 1 + i, ad] (3.2) 


where Ni represents the set of natural numbers beginning from one. Given how the indices i 
are defined, the very first interval backwards from k is independent of equation 3.2 and is given 
by [0, a]. This is considered the base interval. Subsequent intervals are defined as a function 
of this base, i.e., as a function of a and determined by eq. |3.2| Let i m ax be the value of i 
such that k £ I m ax = [' 

£ = {!= [a 1 ” 1 


Q max -}- imaxi^ 

a 1 } : * £ Ni,* 1,* < 


**]. The complete set of intervals then is defined as 

J- 


Definition 3.2. For the set of intervals ( as defined in Definition 3.1, D = {d = 2i — 1 : i £ Ni, i 7 ^ 
1 } is the set of distances d by which the corresponding intervals in f are sampled at. 


Theorem 3.3. In general, for two-dimensional diffusion without consumption or generation terms 
for any interval as defined in Definition 3.1, the Griinwald-Letnikov discretization with adaptive 
time step memory is given by 


? A+i _ n.k 

u j,i u j,i 


\r 1 

A?, 






k—n 


n—0 


ic J2 (2 J2 1 P('Y’P) S j,i P 

i —2 m i =a i ~ 1 +i P—mmax+imax 


(3.3) 


where p £ Ni, and for each i (i.e. for each interval) M = {mi = a 1-1 + (2 i — 1 )r] — i + 1 : 77 £ 
Ni & To; < m max } is the set of time points over which xffyis evaluated. Since the time 
point k may be less than the full length of the last interval I max , \m max \ < \k — i max \ represents the 
maximum value in I m ax that is evaluated, i.e. the last element in the set M for I max . 

Proof. The first summation represents the basis interval and the limits of the summation operator 
imply the contribution of every time point, i.e., n £ Ni. For intervals beyond a: any arithmetic 
sequence defined by a recursive process v v = is v -1 + d, p £ Ni for some distance d , the 'i] th value in 
the sequence can be explicitly calculated as v rt = u\ + (77 — 1 )d given knowledge of the sequence’s 
starting value v-\ . For the set ( this implies that v\ = a * -1 + i and d = 2i — 1 for a given i. This 
then yields = a 1-1 + i + (77 — l)(2i — 1 ) = a 1-1 + (2 i — 1)77 — i + 1 := as required. The outer 
summation collects the summations of all intervals that belong to the set f. The last summation 
on the interval [m max + i m ax,k\ ensures that the final point(s) of the backwards summation are 
still incorporated into the computation even if the arbitrarily chosen value of a does not generate 
a final increment length that ends on k. □ 

Note that D is not explicitly needed for computing equation |3.3| because the distances d are 
implicitly taken into account by f. Using the median value of each increment avoids the need for 
interpolation between time points. The implementation of the adaptive memory method described 
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here is necessarily limited to smooth functions due to the assumption that the neighbors of the 
median values used in the summation do not vary much over the increment being considered. This 
method essentially allows for a contribution by all previous time points to the current calculation, 
yet reduces computational times by minimizing the total number of calculations. 

3.2 Comparison to short memory methods 

The results of using various L (for short memory) and interval steps a (for adaptive memory) are 
shown in Fig. [3] Increasing the values of L and a resulted in a decrease in the error of the estimated 
results but at the cost of increased computation times. Conversely, decreasing the values of L and 
a resulted in a decrease in computation times, but at the expense of accuracy. In all cases however, 
the adaptive memory method had a significantly lower error for the same simulation time, and also 
reached a clear asymptotic minimum error much faster than the minimum error achieved by the 
short memory method. In these simulations, a = 1, /3 = 0, A t = 1, A x = 10, using a 20 x 20 
grid, and ran for t = 1500 where U® q 10 = 10. The error for the ‘short memory’ method increased 
comparatively quickly and worsened as 7 —» 1. This was due to the fact that the evolution of the 
solution was initially quite fast at times near t = 0 , which were the first time points cut by the 
‘short memory’ algorithm. 


4 Adaptive time step as a power law 


While computationally efficient and intuitively straightforward, the major drawback to the adap¬ 
tive memory method computed as an arithmetic sequence, is the necessary amount of allocated 
memory. The backwards time summation grid changes with every step such that every point in the 
history needs to be retained and stored during the course of a simulation. For high dimensional 
diffusion this limits the practical applicability of the method. For example, if one were to solve 
the three dimensional diffusion equation on just a 100 x 100 x 100 set of grid points, that would 
require the storage of 1 , 000,000 data points per time step, which over the course of a long sim¬ 
ulation would overwhelm the memory capacity of most computers in addition to greatly slowing 
simulation times. The same memory issues arise when considering another popular approach used 
for solving the fractional diffusion equation, discussed in 29 , that uses an implicit matrix solution 


to simultaneously solve all points in time using a triangular strip matrix formulation. This later 
approach is very powerful but does not take advantage of any short memory algorithms. 

In this section we improve on our results and develop a framework that uses a power law-based 
rule for eliminating past history points without sacrificing numerical accuracy. While the adaptive 
memory algorithm can be used with various distributions other than an arithmetic sequence, our 
motivation for choosing a power law-based distribution is that it matches the natural power law 
dynamics of the underlying mathematics associated with the time-fractional derivative, and it is also 
easy to model and implement. We therefore considered it a good starting point for improvements 
on the original arithmetic sequence version of the adaptive memory algorithm. We note however 
that other distributions can be combined with an adaptive step approach that would result in 
varying degrees of efficiency and accuracy. This will depend on how well they match the dynamics 
associated with time-fractional diffusion. In this section though we only consider a power law-based 
rule. 

An adaptive memory approach applied to a power law distribution, in combination with a 
linked-list based method, minimizes the storage of the past history of the function. This results in 
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ip( 7 , m) (Fading system memory) 



m =0 


Full discretization 


m=k 2 ‘’I’i'h m ) S j,i m m =0 

<--- 



Short memory approach 


m=L/A t ^( 7 , ro)^, m m=0 



Adaptive memory approach 


m=k - 1 )^( 7 , m)<5£, m m =0 

<- - - 

III! . I 

(xl") i—^ i—^ i—1 

v ' weight x5 weight x3 weight xl 


Figure 2: Short memory and adaptive memory methods for estimating the the Griinwald-Letnikov 
discretization. Both approximations rely on the sharply decreasing function |t/>( 7 , to) | as m increases 
to omit points from the backwards summation. While short memory defines a sharp cut off of points, 
adaptive memory provides a weighted sampling of points for the entire history of the system. Points 
included in the computation by each method are highlighted in red. The shape of ijj is different for 
7 < 1 and 7 > 1 , but the shape of \ip\ remains a monotonically decreasing function (as to increases) 
for both cases, and remains consistent with the principle that more recent time points contribute 
(whether positively or negatively) more to the solution at the next time step, than time points 
further back in the history of the system. See text for details. 
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7 = 0.50 7 = 0.60 




7 = 0.75 7 = 0.90 



Figure 3: Comparison of the error between adaptive memory and the short memory as a function 
of the calculation time (x-axis: computation run time in seconds) expressed as a percentage error 
relative to the computed value for the full non-shortened approach (y-axis). Four different values 
of 7 are shown. 
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Full discretization 


m=k 1 / 1 ( 7 , m)^ m m =0 



Short memory approach 


m=L/A t Y, ^(7. m ) S j,l m m=0 



Adaptive memory approach 


m—k 5Z(2* - IJV’C'T’j ”* m =0 


Inii in 1111111 in 1 mu .... 
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v 7 weight x5 weight x3 weight xl 
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Minimal memory approach 


m=k Y(2' m =0 


lllllllllIllllIllllllllllI[|lllllflMlilllllllIlllllll[IMIIIllMI»IIBl««l11HII 

i=4 i=3 i=2 i=l 

weight x 8 weight x4 x2 xl 


Figure 4: Minimal memory implementation 


decreasing the amount of total memory allocated to run a simulation of N time steps from O(N) 
to 0(log2(N)), resulting in a tremendous memory savings. Given a simulation of N time steps 
and number of grid points X/A x , where X is the grid width, storing the entire past history would 
require points to be stored, which grows linearly with N. In contrast, an adaptive mesh with 

a power law growth in spacing (in this case 1,2,4...), results in j^-log^N) points being stored 
in memory, which grows with the log of the number of points N. The advantage of using such 
a power law scaling is that one can a priori calculate memory points which will not be used at 
every time step of the simulation and de-allocate the memory assigned to storing those time points. 
With the adaptive memory method, past points needed for each subsequent time step change with 
each iteration, necessitating that all time points be stored for future calculations. The use of a self 
similar power law allows the elimination of points that will not be needed at any future time in 
the calculation. A comparison of the implementation of this method with the full implementation 
using every time point, is shown in Figure |4j 
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4.1 Set theoretic implementation 

In one spatial dimension, an FTCS discretization of the fractional diffusion equation on a grid Uj 
where i = t/A t ,j = X\j A x , can be written as 


^} +1 - “5 

A t 


x m =0 


i—rn 


( c.f. equations 2.23 to 2.25 above and 301), where 


(4.1) 


<5j = « 1 -2u}+<4_ 1 ) (4.2) 

Definition 4.1. Define a well-ordered set U in time consisting of elements Uj ordered by the point 
in time i = t/A t at which that grid point occurred. The least elements in this set are then the 
points where and the greatest are the points u k , where the current time point is k = t current /A t . 
For all integers A and B, if A > B 1 then u 4 > . 

Given the numerical scheme in |4.1[ the set U can be expressed as the recursive algorithm 

u j +1 = u j +a % ( 4 - 3 ) 

x u} 


where u° is the set of initial conditions at t = 0. 

Taking advantage of this result we can state the following lemma for the short memory approach: 

k— l 

Lemma 4.2. Assume a short memory scheme. The elements u l < u A * in U for a time step i 
are not used in future computations and can be removed from U. 

The set U can then become a list of sets U l with only the necessary elements to complete the 
recursive relation in each step i. 

Proof. The short memory approach shown in Figure [4] can be written as 

uk j +1 = uk j +a Af2 Tj (4.4) 

k _ l 

{«*■ eU:u^>u ^t} 

This recursive relation can be written as 



= uj+a ^§- ^2 ,k-i)5). 

x {u)£U k } 


u k+1 

:= {u l £U k :u l >u k ~^} + {u k+1 } 

(4.5) 


with U° := { m 0 }. As the function evolves, elements in U given by u l < u k will never be used 
again and can be removed. □ 
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Numerically only the set U k needs to be stored in memory for the current time point k and all 
points after. As discussed above, by its construction adaptive memory time step as an arithmetic 
sequence necessitates a shifting of the calculated window of points, and as such the entire history 
of the system needs to be stored in memory for the calculation at all time points. However, we 
can construct an adaptive memory as a power law, such that once we know what past history 
points need to be calculated, all other points in between will never need to be calculated. This 
then requires only enough memory to store the known and computed past intervals of the systems 
history, allowing the deallocation and recovery of much of the stored memory. This results in less 
memory requirements which translates into much faster computations. 

Definition 4.3. Define a parameter 77 which determines the ‘reset interval’. This represents the 
number of points in the past history to store in the current time point set U k at each weight. 

Definition 4.4. Define weighting sets W l with elements w l ordered in the same manner as the 
sets U\ 

Algorithm 4.1. Assume an adaptive memory time step power law scheme. Assume w° = 1. 
When there are more than 77 points in the set W l of any given weight, define a subset of W l as 
the elements in W l of the given weight. Then from this subset, the weight of the least element is 
doubled, and the second lowest element is removed from W l altogether. The time marching scheme 
is given as follows: 


u ) +1 = u j + a ^I H %l){i,k-i)w l 8). 

x {u}et/ fe } 

w k+1 = 1 

W k+1 := {w i e W k } + {w fc+1 } 

U k+1 := {u* &U k } + {u k+1 } 

(4.6) 


so that when the number N elements of a given weight is N > 77 , the algorithm is condensed. 

4.2 Numerical implementation 

From an applied perspective, to keep a well ordered list of points in U we make use of linked lists 
as the data structure. A linked list is one of the fundamental data structures used in computer 
programming. It consists of nodes that can contain data and their own data structures, and also 
pointers from each node to the next and/or previous nodes (Fig. UK)- This means one node can 
be removed from the set and deallocated from memory while the order of the set is maintained. In 
our case each node is an item u of the set U describing a time point necessary for the current time 
step, with the entire list representing all points u that make up the current iteration U k . Once a 
time point u is no longer necessary for the simulation, it can be removed (Fig. [ 5 J 3 ). New computed 
time points, u k+1 are added into the list (Fig. 1 3 . The data structure is initialized as illustrated 
in Fig. [HJd. At each time step, a new structure containing u k+1 and w k+1 is added onto the end 
of the list. When the elements of a specific weight w grow larger than the limit 77 , the first two 
values of the weight to are condensed into one value, with the weight doubling and one value being 
removed from the memory. We show as an example the fourth step in a simulation when 77 = 3 
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(Fig. § 3 ). As this process iterates in time, one has a condensed list of only the time points that 
will be needed for the rest of the simulation in the list. For example, Fig. [5J 7 shows the transition 
to the 17 th step of a simulation. 


5 A smart adaptive time step memory algorithm 

Regular diffusion relies on the assumption of a Markovian process that is not necessarily present in 
natural systems. One approach to modeling these non-Markovian processes is using the fractional 
diffusion equation introduced above. Mathematically such methods have been around for decades 
but it is relatively recently that they have been applied to the natural sciences. Computationally 
efficient methods for numerically evaluating these equations are a necessity if fractional calculus 
models are to be applied to modeling real world physical and biological processes. It should be 
noted that while in this work the simulations were done in the subdiffusive regime for a simple point 
source, the methods we derive here are directly applicable to more complex sources or superdiffusion 
(7 > 1). However, complex fast-evolving past histories in the form of a forcing function ( f(u )) or 
oscillations generated in a superdiffusion regime will result in much larger errors for both the short 
and adaptive memory methods. In the case of the adaptive memory method introduced in this 
paper this is due to its ‘open-loop’-like algorithm that blindly increases the spacing between points 
in the summation as the calculation goes further back in time and which does not take into account 
the speed of evolution of the equation. Adaptive time approaches for regular differential equations 
often make the integration step size a function of the derivative, i.e., more closely spaced time steps 
are used when the function is oscillating quickly and cannot be averaged without loss of accuracy, 
and widely spaced time steps are used where the function is smooth. In the current implementation 
we have assumed that the past history function ^>( 7 , is smooth. In this last section we 

extend the original implementation to develop a ‘smart’ ‘closed-loop’-like algorithm where the step 
size of the backwards summation is dependent on the derivative of the past history function, i.e., a 
form of feedback. This optimizes the computational speed of the simulation while reducing the error 
due to the averaging of time points in the backwards summation, ultimately resulting in low errors 
for both high frequency forcing functions in the subdiffusion regime and for oscillations intrinsic to 
the superdiffusion regime. 


5.1 Approximating the discrete Griinwald-Letnikov series as a continu¬ 
ous time integral 

Our analytical approach for smart adaptive memory makes use of the continuous Griinwald-Letnikov 
integral. In this form, we can then define a minimum threshold error function based on the derivative 
that ensures that no period in the history of the system is misrepresented up to the defined error. 


Recalling equation 2.29 the discretized form of the fractional diffusion equation, the summation 
term can be defined as a Riemann sum, which in the limit as A f —> 0 approaches a continuous time 
integral. The benefits of an integral formulation is that the backwards integration would be separate 
from a defined time grid for the series, and higher order methods for integrating, such as Simpson’s 
rule, could be used. This would be impossible in discrete time since it is necessary to project and 
interpolate between points on the grid. 

Defining a Riemann sum S over an interval x\ to x n , 
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Figure 5: Overview of the use of linked list data structurjg for the algorithmic numerical implementation of adaptive 
memory time step as a power law scheme. See the text for details. 






















































































































































































































































































































































(5.1) 


S = ^2,g{yi)(xi - Xi- 1 ) 

i=0 

there is a correspondence between the discrete summation and the integral 


such that, 


u^l 1 - u k . ^ 

- = A t 

* m=0 


g{yi) 

= i’h ,m)f{u k . 

i 

= TO 

n 

= */ T 

Xi 

= m = 0 ...n 


where the width of each segment is 1. As A f —> 0, this sum gets closer and closer to, and can be 
approximated by, the continuous time integral 


^(7 ,x/A t )f(u(t - t) dr 


T —0 


which allows the discretized version to be rewritten in continuous form as 

rt 


u k+1 — u k 

= Ar i 


A t)f(u(t - T))dr 


' r—0 


(5.2) 


(5.3) 


The function f{u{t — t)) can be interpolated from the previously calculated values of u. The 
original definition of 0 ( 7 , m) is only defined for m £ No, and so needs to extend into the continuous 
domain. With an analytical continuous function representing 0, one is then free to rediscretize the 
continuous integral in the most optimal way to solve the problem efficiently. 


5.2 Extension of ^( 7 , 777 ) to the positive real domain 

We are interested in a function ^( 7 , r) that is an extension of 7 , m) into the positive real domain 
and is defined such that for all r £ No, ^( 7 , r) = ^>( 7 , m). We begin with a basic linear interpolation 
of - 0 ( 7 , to) over non-integer space, and the result is shown in Fig. [§Y for various values of 7 . While 
a linear interpolation provides a reasonable approximation of %l>, it is not a smooth function and 
it is a first order approximation that obviously does not work well for areas of if) that have a high 
second derivative (e.g., r <Cl, for 7 < 1). Since we don’t have the ability to increase the number 
of points we are basing our interpolation on, we consider other options to obtain a more accurate 
and smoother approximation. 

Next, we consider simply extending the original definition of ' 0 ( 7 , to) and expanding it to all 
points r by using the gamma function to re-express the factorial. With r extending to non-integer 
space in the positive real domain, and with the (—l) r term, we get an oscillating complex function. 
A plot of the real part of the function, 


T( 7 , r) = Real 


(~l) r r(2 - 7 ) ] 

T(r + 1 )T (2 — 7 — r) J ’ 


(5.4) 


20 







Figure 6 : Computing a continuous version of for various values of 7 . In all subplots, 7 = 

(.85, .90..., 1.10, 1.15) from the bottom trace to the top. Exact function %p is denoted by discrete 
points. A. ’F( 7 ,r) as a linear interpolation of the exact ^( 7 , m). B. Extending the definitions of 
^>( 7 , to) to all r in the positive real domain. 
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shows that this method results in a poor approximation of ip due to the oscillatory behavior (Fig. 

Another possibility for deriving a continuous function ’F is to consider a rational polynomial 
function. One can rewrite the recursive series 'tp(py,m) as a truncated approximation using the 
relation from equation|2.14| 


2 — 7 — m 


l,m ) = — 0 ( 7 , m — 1) 

m 

0 ( 7 , o) = i 

This can be written in terms of a finite product as 

(-i) m r(2- 7 ) 


0 ( 7,m) = 


-ip(-r,m - 1) 


m!r(2 — 7 — to) 
2 — 7 — to . 


n 


7 + n — 2 


n 


i+ 


7-2 


(5.5) 


Then, using a transform to show that an infinite rational function will intersect all these points, 
define a rational function so that 


T(7,r) = R aJ3 


Pa 

Qp 


(5.6) 


where 

Po+Pir+p 2 r 2 ...p a r a 

R a ,p = -:-;-n- W- ( 5 -‘) 

qo + qir + q 2 r z ...qprP 

As (a,/3) —> oo, 4>(7,r) will approach for all r £ No- 

An infinite rational expression, however,would be too costly to compute. One can truncate 
the expression however, and get a closed-form expression with a very close fit that approaches the 
analytical recursive ip( / y,m). As to —> oo, tp —> 0, which implies that p) > a for this truncation. 

Given the number of coefficients po-, qo, ■■■, qp and flexibility in choosing a and /?, there are 

multiple possible solutions to equation |5.7| But for all cases we consider the basic set of constraints 
given by the system of equations: 


iHt.o) 

= *(7,0) 


■0(7,1) 

= *(7,1) 


■0(7,2) 

= *(7,2) 


•0(7, M) 

= *(7 > M ) 

(5.8) 


where M is an integer. 

The values of a and /3 are also important considerations, since higher order polynomials will yield 
a more accurate approximation. Although the resulting function will increase in complexity along 
with accuracy, a powerful advantage of a ip —> transform that uses a finite rational polynomial 
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function is that we will obtain a continuous version of the recursive function ip, that is a closed-form 
expression. 

No matter the exact method or transform used to obtain the new function 4'( 7 , r), once derived, 
we can then directly insert it into the numerical implementation. We can drop all points in the past 
history function (^(y, m)f(uk-m)) hr the regions where the second derivative is below a certain 
threshold (i.e., where the function is slowly changing), and integrate the function on the resultant 
mesh using equation 5.3 with the substitution of the continuous ^(y, r), with r = r/A t : 


u k+1 — u k r t 

- = A t~ X , t / A t ) f (u(t — t))cLt (5.9) 

A t Jr=0 

As discussed before, this smart adaptive step extension to the original algorithm will allow us 
to minimize errors by using smaller integration step sizes when the past history function is quickly 
changing due to a fast-evolving external forcing function to the system, or oscillating behavior 
integral to the dynamics of the system itself. In addition, this approach will be able to take 
advantage of the large body of existing literature and numerical methods solutions packages for 
solving continuous equations. 
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